Essential Steps and Practical Applications for Database Studies

As the information collected from health care encounters becomes more available, managed care pharmacy will gain greater insight into the impact of pharmaceutical policy on patient outcomes. Database studies provide valuable information that depicts actual health care consumption and provide a tool to help manage the health care benefit. Compared with clinical trials, one strength of database studies is that they are nonintrusive to patients and providers. However, the integrity of the data and of any subsequent analysis depends on accurate and consistent coding practices at the time of data entry into the system. This article describes the 6 main steps required to complete a database study.

are important to ensure that, if an effect deemed to be clinically important exists, then there is a high chance of it being detected (i.e., that the analysis will be statistically significant). 2 Statistical power is the probability of getting a statistically significant result given that there is a real effect in the population being studied. If a particular test is not statistically significant, it is either because there is no effect or because the study design makes it unlikely that a biologically real effect would be detected. Power analysis can distinguish between these alternatives and is therefore a critical component of designing experiments and testing results. 2 (Power depends on many factors such as the type of test, α-level, and effect size; the reader is referred to specific statistical references for more information.) The sample size must be large enough so that the detected change is of scientific significance as well as statistical significance. 3 A study that is too small will not clearly show clinical significance. However, if the study sample is too large, even an effect of little clinical importance becomes statistically detectable. 3
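To make the relationship among α-level, power, and effect size concrete, the following is a minimal sketch of one common textbook approximation: the per-group sample size for a two-sided comparison of two proportions. The function name and the example rates (20% versus 15%) are illustrative assumptions and are not taken from the article; a statistician should confirm the appropriate method for any real study.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p1 vs. p2.

    Uses the normal-approximation formula for a two-sided test of
    two independent proportions with equal group sizes.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the alpha-level
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical example: hospitalization rate of 20% vs. 15%.
n_moderate_effect = sample_size_two_proportions(0.20, 0.15)
# A smaller difference (20% vs. 18%) requires many more patients.
n_small_effect = sample_size_two_proportions(0.20, 0.18)
print(n_moderate_effect, n_small_effect)
```

Under these assumptions, shrinking the effect size sharply increases the number of patients required, which is why the power analysis should be defined before the inclusion and exclusion criteria reduce the available population.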

Strengths of Database Studies
Database studies offer several advantages over clinical trials. Most notably, database studies contain valuable information that depicts actual health care resource consumption. Practice and prescribing patterns can be observed without introducing extraneous variables. As database studies are always retrospective, there is never an alteration in individuals' behavior because of awareness of being studied or an increased awareness of their own behavior. Overall, database studies are nonintrusive to patients and providers, and data for large numbers of patients may be handled more quickly and less expensively than in a prospective clinical trial.
Results of retrospective studies are not usually subject to selection bias, which is a systematic error in creating intervention groups. Although patients are included or excluded based on several criteria, the criteria are applied to all patients in the database instead of first selecting the patients, then applying the treatment as in a prospective clinical trial. However, selection bias may be introduced by the inclusion and exclusion criteria used in the study design and by whether or not incomplete records are used in the evaluation. For example, a patient who has depression but is not coded as having depression would not be included in a depression database study. Selection bias may also mean that the participants are not representative of the population of all possible participants.

Study Limitations
Certain aspects of database analysis also present challenges to the scientist or health care professional. The factors that pose the greatest challenges involve data collection and storage, areas where the analyst has little or no control. Significant variation in the accuracy and consistency of coding practices and associated data quality presents a problem to the database researcher because if there is inconsistency in coding, there is inconsistency in the resulting judgments derived from those data. 4 Accurate data collection relies on input ranging from proper ICD-9 (International Classification of Diseases, 9th edition) and procedure coding to the level of the retail pharmacy, where most outpatient pharmacy claims are generated.
For example, the pharmacy field "days supply" may be used as an indication of patient compliance. However, this field is often an inaccurate marker due to factors such as dosage titration schedules, unknown actual use, and medications used only as needed. Since direct communication with a patient is not practical, when the retrospective analysis shows, for example, potential noncompliance, there is no way to determine the reasons for the observed underutilization in medication therapy.
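As an illustration of how the "days supply" field is often turned into a compliance proxy, the sketch below computes a simple possession ratio: total days supplied divided by the days in the observation period. The function name, the (fill date, days supply) tuple layout, and the sample fills are hypothetical. The result inherits every limitation described above, since it cannot reveal titration, as-needed use, or actual consumption.

```python
from datetime import date


def possession_ratio(fills, period_start, period_end):
    """Days supplied within the period divided by days in the period.

    `fills` is a list of (fill_date, days_supply) tuples. The ratio is
    capped at 1.0 so early refills do not produce values above 100%.
    """
    period_days = (period_end - period_start).days + 1  # inclusive of both ends
    supplied = sum(
        days_supply
        for fill_date, days_supply in fills
        if period_start <= fill_date <= period_end
    )
    return min(supplied / period_days, 1.0)


# Hypothetical patient: two 30-day fills during a 90-day period.
example_fills = [(date(2023, 1, 1), 30), (date(2023, 2, 15), 30)]
example_ratio = possession_ratio(
    example_fills, date(2023, 1, 1), date(2023, 3, 31)
)
print(round(example_ratio, 2))
```

A low ratio here flags only potential underutilization; the database alone cannot explain why the gap occurred.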
Database analyses require certain assumptions to be made, which are usually based on clinical knowledge about disease treatments, disease severity, and indicated and typical uses of medications. For example, alterations in practice patterns, such as off-label prescribing for a newly approved medication, need to be taken into consideration. To enable some reasonable assumptions to be made regarding patient types, medication therapy, or disease severity, a workable understanding of the disease being treated is important for effective database research. In the case of asthma, for instance, knowledge about types of inhalers used, usual inhaler use, and their relationship to disease severity will help the researcher to determine controlled or uncontrolled disease.
Other challenges to database studies that arise frequently are the age of the data and interdatabase compatibility. A simple example is a database of pharmacy claims generated from online pharmacy adjudication systems, which is generally available within 1 month. However, when the payer uses multiple vendors for pharmacy processing, or when mail order is required for chronic medications and local retail pharmacy for acute medications, multiple databases will need to be used to complete the study or important information will be missed. Other sources of medical data are somewhat older due to a delay in the claim submission process and are not immediately available for analysis. Many times, integrated databases require reprogramming into a compatible format to be usable. A data set may also need to be cleaned for duplicate records or missing values. These processes add to the age of the data, possibly up to 6 months.
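The cleaning step mentioned above can be sketched as follows, assuming claims are held as simple records. The field names (`patient_id`, `fill_date`, `drug_code`) and the rule that a claim is a duplicate when all three match are assumptions made for illustration; real claim layouts and duplicate definitions vary by vendor.

```python
def clean_claims(claims, required=("patient_id", "fill_date", "drug_code")):
    """Drop claims with missing required fields, then drop exact duplicates.

    A claim counts as a duplicate when every required field matches a
    claim that has already been kept. Input order is preserved.
    """
    seen = set()
    cleaned = []
    for claim in claims:
        # Skip records where any required field is absent or empty.
        if any(claim.get(field) in (None, "") for field in required):
            continue
        key = tuple(claim[field] for field in required)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(claim)
    return cleaned


# Hypothetical claims merged from two pharmacy vendors.
raw_claims = [
    {"patient_id": "A1", "fill_date": "2004-01-05", "drug_code": "X"},
    {"patient_id": "A1", "fill_date": "2004-01-05", "drug_code": "X"},  # duplicate
    {"patient_id": "A2", "fill_date": "", "drug_code": "Y"},            # missing date
    {"patient_id": "A3", "fill_date": "2004-01-09", "drug_code": "Z"},
]
cleaned_claims = clean_claims(raw_claims)
print(len(raw_claims), "->", len(cleaned_claims))
```

Recording how many records each cleaning rule removes makes it easier to judge whether the cleaned data set still represents the original population.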
As retrospective databases pose several methodological challenges, it should be noted that a checklist was recently published that can be used by decision makers to evaluate the quality of published studies that use health-related retrospective databases. 5

Database Variables
Variables in a database are pieces of information that can be specified based on the requirements of the study, such as the inclusion and exclusion criteria, patient groups, and the timeline of the study. Variables serve to limit the scope of the study, design the study, and group together information about the patients. They may be simple, such as inclusion criteria for age or insurance coverage, or complex, such as criteria for medication switching within a certain timeline.
The goal of the study will help to determine some of the variables, such as the timeline of data used in the analysis. If the goal of a health plan's study is to use the results to change patient or prescribing behavior, then a 3-month review of the data may be sufficient, whereas a longer history will produce more accurate results for an analysis of, for instance, morbidity or mortality outcomes associated with a particular intervention.

Proxy Outcomes
Clinical outcomes may be difficult to determine in database studies, as an observational study does not prove a cause-and-effect relationship. Instead, a proxy outcome, such as a hospital admission or an emergency room visit, may be used. Additionally, if medical data are not available, which is the case in most pharmacy benefit management databases, then the addition of a medication to a treatment regimen or medication usage patterns may act as the proxy outcomes.

Six Steps to Designing a Database Study
There are 6 main steps for completing a database study (Table 2). 1 A brief discussion of each of the steps follows.

Step 1: Define the Study Objective
The purpose of the study, or the study objective, identifies the goals of the study and may be thought of with a statement such as, "I want to find out X from this database study." The study objective may be a simple cost assessment, conducted to make an administrative decision, or it might be a comparative pharmacoeconomic outcomes study, designed for external presentation and publication. This initial step may include a quick check to determine the capabilities of the database (e.g., the available data fields) and to ensure that the study design and conclusions are consistent with the database, although this is not considered a formal requirement of the first step. 6

Step 2: Identify the Data Elements
Data elements are extracted from the database to help define the patient groups or the other units of comparison. Some examples of data elements are provided in Table 3. In comparison with database variables, which may be modified secondary to the requirements of the study (step 5 provides more information on this concept), the data elements cannot be changed. For ease of conducting the study, it is recommended to define the data elements as clearly as possible.

Step 3: Identify and Apply Specific Inclusion/Exclusion Criteria
To select a subset of eligible patients for further analysis, inclusion/exclusion criteria should be applied after the data have been extracted. The criteria must be consistent with the structure of the database elements. For example, the data elements may include a procedure code for a blood draw but not the results of the blood draw. The criteria therefore cannot depend on specific lab results, only on whether a lab test was performed. Each criterion should be applied in a stepwise manner, one criterion at a time, with the results of each application examined so that any adjustments can be made if unexpected results occur.
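The stepwise approach described above can be sketched as a small routine that applies one criterion at a time and records how many patients remain after each step. The record fields (`age`, `months_enrolled`, `asthma_dx`) and the thresholds are hypothetical and chosen only to illustrate the pattern.

```python
def apply_criteria_stepwise(patients, criteria):
    """Apply (label, keep_function) criteria in order.

    Returns the remaining patients and an attrition list of
    (label, patients_remaining) so each step can be reviewed.
    """
    remaining = list(patients)
    attrition = [("all extracted patients", len(remaining))]
    for label, keep in criteria:
        remaining = [patient for patient in remaining if keep(patient)]
        attrition.append((label, len(remaining)))
    return remaining, attrition


# Hypothetical extracted records.
patients = [
    {"id": 1, "age": 45, "months_enrolled": 12, "asthma_dx": True},
    {"id": 2, "age": 16, "months_enrolled": 12, "asthma_dx": True},
    {"id": 3, "age": 60, "months_enrolled": 3, "asthma_dx": True},
    {"id": 4, "age": 52, "months_enrolled": 24, "asthma_dx": False},
    {"id": 5, "age": 38, "months_enrolled": 18, "asthma_dx": True},
]
criteria = [
    ("age 18 or older", lambda p: p["age"] >= 18),
    ("enrolled at least 12 months", lambda p: p["months_enrolled"] >= 12),
    ("asthma diagnosis coded", lambda p: p["asthma_dx"]),
]
eligible, attrition = apply_criteria_stepwise(patients, criteria)
for label, count in attrition:
    print(f"{label}: {count}")
```

Reviewing the count after each criterion makes an unexpectedly large drop visible immediately, which is the point at which the criterion or the underlying coding should be re-examined.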
Database studies require that control groups be created, since patients often receive more than one type of treatment or have more than one diagnosis of interest. If costs and outcomes are to be compared for different groups of patients, the comparison groups need to be created using key data elements to identify the appropriate patients. Control groups must be well matched for disease, diagnoses, comorbid conditions, age, and gender. Differences in patient characteristics (e.g., age, comorbidities, and severity of illness), pharmaceutical therapy, and other important clinical differences between the groups may then be summarized and reported.

Step 4: Perform Initial Data Analyses and Review Preliminary Results
Initial analyses should be conducted to ensure that, after the restriction criteria are applied, enough patients remain to satisfy the predefined power analysis. Data should be summarized for variables that are easy to analyze, such as the number and percentage of patients who meet the inclusion criteria, have the diagnosis of interest, or used the prescribed medications of interest. Other information, such as patient characteristics and patient counts, should also be reviewed as part of the initial analysis.

Step 5: Create "Calculated" Analyses Variables
Some complex analyses can be made easier by creating new calculated variables to facilitate data analysis and summarization. A common example is the calculation of patients' ages at a certain time by subtracting each person's date of birth from their date of hospitalization. Another example might be the calculation of persistence by identifying refill dates and amount dispensed and averaging over a preidentified time period. These new calculated variables may be placed in a separate column for further analysis.
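The age example can be sketched as follows. The field names (`date_of_birth`, `admit_date`, `age_at_admission`) are hypothetical; the calculated value is stored under a new key, mirroring the idea of placing it in a separate column.

```python
from datetime import date


def age_on(date_of_birth, event_date):
    """Age in completed years on event_date."""
    years = event_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the event year.
    if (event_date.month, event_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical hospitalization records.
admissions = [
    {"id": 1, "date_of_birth": date(1950, 6, 15), "admit_date": date(2004, 6, 14)},
    {"id": 2, "date_of_birth": date(1950, 6, 15), "admit_date": date(2004, 6, 15)},
]
for record in admissions:
    # The calculated variable is added alongside the original data elements.
    record["age_at_admission"] = age_on(record["date_of_birth"], record["admit_date"])

print([record["age_at_admission"] for record in admissions])
```

Keeping the original data elements untouched and adding the calculated variable separately allows the calculation to be rechecked or revised later without re-extracting the data.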

[Table: Example of Resulting Patient Counts After an Initial Data Analysis and Then the Complex Data Analysis]

Step 6: Apply the Appropriate Statistical Tests
Lastly, the appropriate statistical tests are applied to evaluate the significance of the differences. When reporting study results, it is important to recognize the difference between statistical and practical significance of the findings. 6 A study may show statistical significance, but it is always prudent to ask "What are the practical outcomes of this study?" as it relates to the population in question.
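The distinction between statistical and practical significance can be illustrated with a minimal sketch of a pooled two-proportion z-test. The test choice, the function name, and the counts are assumptions for illustration only; the appropriate test for a given study depends on its design and data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(events_1, n_1, events_2, n_2):
    """Return (difference in proportions, two-sided p-value).

    Uses the pooled-variance z-test for two independent proportions.
    """
    p_1, p_2 = events_1 / n_1, events_2 / n_2
    pooled = (events_1 + events_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_1 - p_2, p_value


# Hypothetical: the same 0.3 percentage-point difference (10.3% vs. 10.0%)
# evaluated in a very large database and in a much smaller sample.
diff_large, p_large = two_proportion_z_test(20_600, 200_000, 20_000, 200_000)
diff_small, p_small = two_proportion_z_test(206, 2_000, 200, 2_000)
print(f"large sample: difference={diff_large:.3f}, p={p_large:.4f}")
print(f"small sample: difference={diff_small:.3f}, p={p_small:.4f}")
```

In this hypothetical case the identical difference is statistically significant only in the very large sample, so whether a 0.3 percentage-point change matters in practice remains a separate clinical and business judgment.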

Applications
While one or two database studies provide a snapshot in time, a series of database study results provides a tool to track trends over time. If members are enrolled with the same managed care plan over time, long-term costs and clinical outcomes for chronic diseases, and the impact of different factors (e.g., pharmacotherapy, physician specialties) on disease outcomes, may become apparent. Table 4 provides some managed care applications of database studies.
A series of results allows a greater understanding of the overall market, including topics such as how medications are used to treat a disease, markers of concomitant medication use, patient compliance with prescribed therapy, typical morbidity markers for the disease, and concomitant disease processes. A comprehensive understanding of the market may help to predict the economic impact of a newly approved medication, whether from the perspective of the clinical pharmacist, the president of the managed care plan, or the pharmaceutical manufacturer.

Summary
Managed care pharmacy has used administrative databases for drug utilization review purposes since online adjudication has been in existence. As the information collected from health care encounters becomes more readily available, managed care pharmacy will have ever greater insights about pharmaceutical impact on total patient outcomes. With limited resources, database research in health plans is conducted to provide information that may help to shape cost-saving measures or to improve the care or delivery of care for patients. Administrative databases are a useful source of data for retrospective studies that evaluate the effects of policy change on new programs and pharmaceutical therapy. Since administrative databases were not designed for research purposes, care must be exercised to overcome their limitations and to ensure credible results when outcomes for different groups of patients are being compared.

Table 4. Managed Care Applications of Database Studies
• Identify patient populations that may be targeted for patient educational programs
• Identify therapeutic areas where patients and the health plan would benefit from prospective clinical programs
• Manage areas of high cost
• Identify prescribing patterns that fall outside of best practices
• Alert a prescriber of patient behavior (e.g., controlled substance drug utilization review)